Pascal/Delphi Source File | 1993-01-04 | 10KB | 246 lines
{ SORT: merge and sort multiple text files, replaces DOS SORT }
{ Copyright, 1988, 1989, by J. W. Rider }
{ Syntax :
     SORT [options] [<unsorted-file-spec> ... ]
  Where available options are:
     "/r"   reverses the sense of the sort,
     "/"+#  sorts the lines from the data in column #
            -- a second # defines the last column of the key field.
            -- subsequent #'s are ignored.
     "/b"   ignores leading blanks (spaces, tabs) in determining
            the key.
     "/c"   makes the sort case-insensitive 'a'='A',
     "/d"   "dictionary" vs "ascii" sort, only alphanumerics count
     "/f"   interprets column numbers as "awk" field numbers,
            -- does not automatically assume "/b"
     "/h"   displays help message rather than sorting input.
     "/k"   outputs only the key, not the whole line.
     "/n"   sorts the lines numerically vice alphabetically.
            -- "/n" automatically assumes "/b"; in fact, "/n" will search
            -- an entire key field for any hint of a numeric. "DOS1", "DOS2",
            -- "DOS3.3" will all be correctly sorted.
     "/t"C  makes "C" a field delimiter vice blanks. To include
            blanks, use "/t" without any character.
     "/u"   eliminates multiple copies of identical lines
            -- "/u" might not work correctly if keys other than the whole
            -- original line are specified: "/+#","/c","/n","/b"
  If first filename is missing or is '-', reads from standard input.
  Writes sorted lines to standard output.
  The first two options, "/r" and "/"+#, and default use of standard
  input and output are provided for syntax compatibility with MSDOS
  SORT. The other options and command line file-naming are extensions
  inspired by Unix implementations of SORT. }
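{ Example invocations: an illustrative sketch based only on the option
  list above. The file names (NAMES.TXT, SIZES.TXT, A.TXT, B.TXT, OUT.TXT)
  are hypothetical, and the switches are actually parsed in SORTARGS.INC,
  which is not shown here, so treat these as likely usage rather than
  verified behavior.

     SORT /r NAMES.TXT               lines in reverse order to standard output
     SORT /+10 NAMES.TXT             key starts at column 10
     SORT /n < SIZES.TXT > OUT.TXT   standard input, sorted numerically
     SORT /c A.TXT B.TXT > OUT.TXT   merge two files, ignoring case
     TYPE A.TXT | SORT - B.TXT       merge standard input with B.TXT

  All switches come before the first file name. }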
{ If the heap is not large enough to completely hold the sorted file,
or if there is a problem with input/output file names, then
SORT displays an error message to 'CON' and returns ERRORLEVEL 1
to the parent process. }
{ Even if there is not enough room in the heap to sort the file in
memory, sort tries to provide a partial sorting of the file. The
output can be further sorted by sort until the file is completely
sorted. }
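{ A batch-file sketch of the behavior described above, assuming the
  hypothetical file names BIG.TXT, PASS1.TXT and PASS2.TXT. It tests the
  ERRORLEVEL that SORT returns and, if the first pass may have been
  only partial, feeds the output through SORT again:

     SORT BIG.TXT > PASS1.TXT
     IF ERRORLEVEL 1 SORT PASS1.TXT > PASS2.TXT

  Whether a second pass is enough depends on how much of the file fits
  in the heap; nothing here guarantees it. }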
{$A+,B-,D+,E-,F-,I-,L-,N-,O-,R-,S-,V-}
{$M 16384,0,655360}
program sort;
uses dos; { added to facilitate wildcards in filenames }
{ In fact, my personal SORT utility "uses" considerably more units than
what is indicated here. However, my units are not standard. I would
not expect the average user to have ever heard of them. Nor would I expect
the advanced user to even *want* to use them. Instead, I have extracted
the components that SORT references and 'included' them instead. }
const grain = 16 ; { heap granularity; usage here requires power of two }
defaultcase = true; { some SORTs start out with different
case sensitivity. Change it here. TRUE means case sensitive. }
{ Granularity of heap is set to 'grain' bytes, see Turbo Pascal
Reference Guide, pg 199.
SYSTEM.FREEMIN is also set to 16000 in SORTINIT. No investigation
has been made as to whether or not these values are optimal.
Failure to set FREEMIN large enough will cause a run-time error
for files that are too large. }
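{ Illustration only: this is NOT the code in HEAPMEM.FUN, which is not
  shown here. It sketches one common reason a granularity constant must
  be a power of two: a request size can then be rounded up to the next
  multiple of 'grain' with a simple mask. The name 'roundup' is
  hypothetical.

     function roundup(n:word):word;
     begin roundup:=(n+grain-1) and not (grain-1); end;

  With grain=16, sizes 1..16 round to 16 and sizes 17..32 round to 32.
  The mask trick fails for a 'grain' that is not a power of two, and the
  result wraps for sizes within 'grain' bytes of 64K. }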
{$I sort.typ } { Defines the binary tree records. }
{$I sort.var } { Defines all global variables. Includes
procedure "sortinit".}
{ General functions: Some of these functions are written in such a way as
to be generally useful. }
{$I anstr.fun } { Strip all non-alphanumerics from string }
{$I posnum.fun } { Searches a string for a numeric substring. }
{$I bval.fun } { Extract a number from a string }
{$I errexit.inc } { Type message; Set ErrorCode on exit. }
{$I findfld.fun } { Find starting pos for "awk" field in string }
{$I heapmem.fun } { Some "suggested" mods to GetMem and FreeMem. }
{$I isatty.fun } { Determines whether input has been redirected. }
{$I iswild.fun } { Does a string contain either "*" or "?" }
{$I lcase.fun } { Changes all upper case chars in string to lower }
{ Special functions: These functions are unique to SORT and have
questionable utility outside of this package. }
{$I btsort.inc } { Binary tree manipulation, output included }
{$I sortargs.inc } { Handle option switches from command line. }
{$I sorthlp.fun } { Displays the Sort Help Message. }
{$I stdinhdr.inc } { Prompt user for data if input not redirected }
{ Process1file: is what it is all about. The text variable "fi" has
previously been assigned to the named file. This procedure starts
from the beginning of the file, reads each line and stores it into the
binary tree structure until no more lines can be read. The file "fi" is
closed when we are done. }
procedure process1file;
begin
  reset(fi);
  while not eof(fi) do begin readln(fi,s); storeln(s); end;
  close(fi);
end;
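{ Sketch only: STORELN and RETRIEVELN are defined in BTSORT.INC, which is
  not shown here. Assuming a plain, unbalanced binary tree keyed on the
  whole line, a store/retrieve pair might look like the following. The
  names 'nodeptr', 'node', 'insert' and 'walk' are hypothetical, and the
  real code presumably also applies the key options and allocates only
  as many bytes as each line needs.

     type nodeptr = ^node;
          node    = record line:string; left,right:nodeptr; end;

     procedure insert(var t:nodeptr; s:string);
     begin
       if t=nil then begin
         new(t); t^.line:=s; t^.left:=nil; t^.right:=nil; end
       else if s<t^.line then insert(t^.left,s)
       else insert(t^.right,s);
     end;

     procedure walk(t:nodeptr);
     begin
       if t<>nil then begin
         walk(t^.left); writeln(t^.line); walk(t^.right); end;
     end;

  An in-order walk writes the lines in ascending order. Input that is
  already sorted degenerates the tree into a list, so a recursive
  version like this one can run out of stack on long files. }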
{ MAIN: Most of this is abstracted from studies that I have made concerning
a standard method of handling multiple-arguments and filenames in
standard DOS filters. The general approach is to decompose the large task
of merging multiple files into a series of single file tasks. This
skeleton can be modified to handle arguments in another manner. }
begin { sort main }
{ SORT title line: If there are no arguments on the command line and
standard input has not been redirected from a file, then we assume
that the user may not be completely certain as to the proper method of
using SORT. The program provides a little message that indicates
how further help may be obtained. The message is not sent to
standard output; the user will not be inconvenienced if he really
does know what he is doing. }
if (paramcount=0) and isatty(0) then begin
assign(fe,'CON'); rewrite(fe); writeln(fe,
' SORT: Copyright 1988,1989, by J. W. Rider, use "SORT /h" for help.');
close(fe); end;
{ SORT INITialization: Initializing variables in this manner is
time-consuming, but the cost is trivial for sorting files of even
moderate size. My goal for the final program is to have these
variables as typed constants. }
sortinit;
{ Get command line ARGUMENTS: This version of SORT requires all option
switches be positioned before any file names. Once filenames have
started being read, no options can be changed. }
arguments;
{ Needs Help?: If the user specifies that help is desired or if an error
is made in the command line option switches, then just list the help
page and quit the program without error. }
if helponly then begin helpmsg; close(output); close(fi); exit; end;
{ Key fields: If the user has not specified a subset of cols for the
key, use the whole line. }
if keycol=0 then keycol:=1; if keycol2=0 then keycol2:=255;
{ No file names: If no input files are specified, use standard input
as the source }
if parmcount>paramcount then begin
  { If input has not been redirected, provide a little more instruction
    on how to get the sort to work correctly. In any case, just handle
    standard input like it was any other file. }
  stdinhdr; assign(fi,''); process1file; end
{ otherwise merge in each file listed on the command line }
else for i:=parmcount to paramcount do
  { Use standard input if the command line filename is "-". }
  if paramstr(i)='-' then begin stdinhdr; assign(fi,''); process1file; end
  { Otherwise, open each file individually. }
  else begin
    { get complete file name and extension for entry }
    fstr:=fexpand(paramstr(i)); fsplit(fstr,d,n,x);
    { If a directory is referenced, merge all included files }
    if (n='') and (x='') then fstr:=fstr+'*.*';
    { Search for all reasonable files. Be sure to include
      directories. }
    findfirst(fstr,directory+readonly+archive,sr);
    { My preference for SORT was to ignore any attempt
      by the user to sort nonexistent files. (This could
      be modified to detect such attempts. I just decided that
      there was little that my program could tell me about what
      files I wanted to sort.) }
    while doserror=0 do begin
      assign(fi,d+sr.name);
      if (sr.attr and directory)<>0 then
        { Search subdirectories only if they are specifically
          named. Do not perform recursive subdir searches. }
        if not iswild(fstr) then begin
          { This time through, it is safe to ignore directories }
          fstr:=fstr+'\*.*';
          fsplit(fstr,d,n,x); findfirst(fstr,readonly+archive,sr);
          while doserror=0 do begin assign(fi,d+sr.name);
            process1file; findnext(sr); end; end
        { Ignore ambiguous directories }
        else findnext(sr)
      { Merge all non-directory files found. }
      else begin process1file; findnext(sr); end; end end;
{ after all files have been read, write the sorted tree out }
retrieveln; close(output); { IMPORTANT!: close output before exit }
{ If the program is unable to guarantee that the output has been correctly
sorted, a message is generated to the console and a DOS error return is
invoked. At worst, the output will be "partially" sorted. (Whatever
*that* might mean.) }
if sorterror then errexit('Output may not be completely sorted.');
end. {program sort}